Improving molecular force fields across configurational space by combining supervised and unsupervised machine learning


Abstract

The training set of atomic configurations is key to the performance of any Machine Learning Force Field (MLFF) and, as such, the training set selection determines the applicability of the MLFF model for predictive molecular simulations. However, most atomistic reference datasets are inhomogeneously distributed across configurational space (CS), and thus choosing the training set randomly or according to the probability distribution of the data leads to models whose accuracy is mainly defined by the most common close-to-equilibrium configurations in the reference data. In this work, we combine unsupervised and supervised ML methods to bypass the inherent bias of the data for common configurations, effectively widening the applicability range of the MLFF to the fullest capabilities of the dataset. To achieve this goal, we first cluster the CS into subregions similar in terms of geometry and energetics. We iteratively test a given MLFF performance on each subregion and fill the training set of the model with the representatives of the most inaccurate parts of the CS. The proposed approach has been applied to a set of small organic molecules and alanine tetrapeptide, demonstrating an up to two-fold decrease in the root mean squared errors for force predictions of these molecules. This result holds for both kernel-based methods (sGDML and GAP/SOAP models) and deep neural networks (SchNet model). For the latter, the developed approach simultaneously improves both energy and forces, bypassing the compromise to be made when employing mixed energy/force loss functions.
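The cluster-then-refine loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a toy one-dimensional anharmonic potential stands in for a molecule, a hand-written k-means on (geometry, energy) pairs stands in for the clustering step, a nearest-neighbor lookup stands in for the MLFF (sGDML, GAP/SOAP, or SchNet in the paper), and all function names and parameters (`kmeans`, `predict_force`, `cluster_rmse`, `refine`, `n_rounds`, `n_add`) are hypothetical.

```python
import math

def kmeans(points, k, n_iter=20):
    """Tiny k-means (stand-in clustering); returns one label per point."""
    # Deterministic init: evenly spaced picks from the input order.
    centers = [points[i * (len(points) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def predict_force(x, train_idx, xs, forces):
    """Stand-in 'force field': force of the nearest training configuration."""
    nearest = min(train_idx, key=lambda i: abs(xs[i] - x))
    return forces[nearest]

def cluster_rmse(train_idx, xs, forces, labels, k):
    """Force RMSE of the stand-in model on each cluster."""
    sq = [[] for _ in range(k)]
    for x, f, l in zip(xs, forces, labels):
        sq[l].append((predict_force(x, train_idx, xs, forces) - f) ** 2)
    return [math.sqrt(sum(s) / len(s)) if s else 0.0 for s in sq]

def refine(xs, forces, labels, k, train_idx, n_rounds=5, n_add=3):
    """Each round, add the worst-predicted points of the worst cluster."""
    train_idx = list(train_idx)
    for _ in range(n_rounds):
        rmse = cluster_rmse(train_idx, xs, forces, labels, k)
        worst = max(range(k), key=rmse.__getitem__)
        candidates = [i for i in range(len(xs))
                      if labels[i] == worst and i not in train_idx]
        candidates.sort(
            key=lambda i: abs(predict_force(xs[i], train_idx, xs, forces) - forces[i]),
            reverse=True)
        train_idx.extend(candidates[:n_add])
    return train_idx

# Toy dataset: configurations are dense near equilibrium (x = 0), sparse far away.
n = 201
xs = [3 * (-1 + 2 * i / (n - 1)) ** 3 for i in range(n)]
energies = [0.5 * x ** 2 + 0.1 * x ** 4 for x in xs]
forces = [-(x + 0.4 * x ** 3) for x in xs]  # F = -dE/dx

k = 6
labels = kmeans(list(zip(xs, energies)), k)

# Initial training set follows the data distribution (every 20th sample).
initial = list(range(10, n, 20))
refined = refine(xs, forces, labels, k, initial)

print("max cluster RMSE, initial:", max(cluster_rmse(initial, xs, forces, labels, k)))
print("max cluster RMSE, refined:", max(cluster_rmse(refined, xs, forces, labels, k)))
print("training set size:", len(initial), "->", len(refined))
```

The script prints the largest per-cluster force RMSE before and after refinement, together with the training set sizes. The per-cluster error, rather than the global error, drives the selection, which is what lets sparsely populated out-of-equilibrium regions compete with the densely sampled equilibrium region.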


Similar resources

Combining Supervised and Unsupervised

This work combines a set of available techniques (which could be further extended) to perform noun sense disambiguation. We use several unsupervised techniques (Rigau et al., 1997) that draw knowledge from a variety of sources. In addition, we also apply a supervised technique, in order to show that supervised and unsupervised methods can be combined to obtain better results. This paper tries t...


Combining Unsupervised and Supervised Machine Learning to Build User Models for Intelligent Learning Environments

Traditional approaches to developing user models, especially for computer-based learning environments, are notoriously difficult and time-consuming because they rely heavily on expert-elicited knowledge about the target application and domain. Furthermore, because the necessary expert knowledge is application and domain specific, the entire model development process must be repeated for each ne...


Machine learning of accurate energy-conserving molecular force fields

Using conservation of energy, a fundamental property of closed classical and quantum mechanical systems, we develop an efficient gradient-domain machine learning (GDML) approach to construct accurate molecular force fields using a restricted number of samples from ab initio molecular dynamics (AIMD) trajectories. The GDML implementation is able to reproduce global potential energy surfaces of int...


Combining Supervised and Unsupervised Learning for GIS Classification

This paper presents a new hybrid learning algorithm for unsupervised classification tasks. We combined the Fuzzy c-means learning algorithm and a supervised version of Minimerror to develop a hybrid incremental strategy allowing unsupervised classifications. We applied this new approach to a real-world database in order to know if the information contained in unlabeled features of a Geographic Inform...



Journal

Journal title: Journal of Chemical Physics

Year: 2021

ISSN: 1520-9032, 1089-7690, 0021-9606

DOI: https://doi.org/10.1063/5.0035530